Although there is a considerable emphasis on inquiry-based, active learning in
standards documents, curriculum documents, and textbooks, there exists a great deal 
of debate regarding the effectiveness of specific curricular and instructional approaches, 
including kit-based instruction. This study examines the efficacy of science kits in 
improving content knowledge. The method used involved treatment and comparison 
groups composed of 2,299 elementary school students in third, fourth, and fifth grades 
from ten different schools. In all the pairings but one, there were statistical differences 
in favor of the treatment groups or no statistical differences, suggesting that science kits 
enhance students' content understandings. 

Introduction 

Both the National Science Education Standards (NSES) (NRC, 1996) and the
Benchmarks for Science Literacy (AAAS, 1993) echo the science education community's
support for the notion of engaging all students in active, meaningful learning. Such
learning is often associated with hands-on instructional strategies and
student-centered classroom environments; however, many science teachers fail to employ
such research-supported best practices and instead rely on more didactic,
teacher-centered methods. The idea of changing teacher and student roles and altering
learning environments by moving instruction away from more didactic,
teacher-centered forms to more hands-on, student-centered forms historically served as
one of the driving forces behind the use of science kits in formal education (NRC,
2000; Perisi, 1975). Over the past thirty years, however, many have questioned
the effectiveness of kits in promoting and facilitating the type of active learning
supported by reform-based documents (Saul & Reardon, 1996). Criticisms include
the inappropriate implementation of kits in such ways that instruction is rendered
ineffective (Olguin, 1995; Saul & Reardon, 1996). Others, however, have argued
the merits of using science kits on the grounds that they generate greater active
participation among students, empower and engage populations that otherwise
feel disenfranchised, promote positive classroom environments, increase teacher
content knowledge, increase teacher confidence to teach science, and provide
enjoyment for teachers who use them (Gennaro & Lawrenz, 1992; Houston, Fraser,
& Ledbetter, 2003; Monhardt, Spotted-Elk, Bigman, Valentine, & Dee, 2002; NRC,
2000; Ward, 1993).

Research supporting the assertion that science kits increase teacher confidence 
in teaching science was of particular interest to us because we are aware that one of 
the major concerns regarding the teaching of science in elementary schools involves 
low teacher confidence (Rice & Roychoudhury, 2003). Such concern is grounded
in research reporting that many elementary teachers consider themselves to be 
uninformed concerning scientific content, making their development or choice of 
inquiry-based, hands-on science lessons an experience filled with apprehension 
(NRC, 2000). High anxiety coupled with no tangible external incentives to include 
science in their teaching and high-stakes testing demands in other content areas 
creates an atmosphere where science instruction becomes expendable. 

In response, many teachers, science education specialists, and administrators
turn to science kits to address insufficient teacher content knowledge,
lack of confidence, and concerns about frequency and quality of instruction (NRC,
2000). Determining whether these science kits are effective in enhancing student
achievement provides these stakeholders with the ability to make more informed
choices regarding their personal and collective investments in a given instructional
approach. As such, the primary objective of this study was to examine the efficacy
of the use of science kits in elementary contexts. In particular, we were interested 
in the relationship between an initiative to systemically implement kit-based 
instructional strategies within a large school district and student achievement 
regarding selected science concepts. 

Methods 

Participants included a total of 2,299 elementary school students in third, 
fourth, and fifth grades from ten different schools within a large school district 
in the southeastern United States. Teachers administered researcher-developed 
instruments (Appendices A, B, & C) to all students in their classes. The students in 
all ten participating schools completed the instruments during the same week. 

Research Design 

The five schools that constituted the treatment group had used science kits for
as many as the past two years, depending on the age of the school. Each grade
level used different kits (e.g., Science and Technology for Children [STC], Full Option
Science System [FOSS], Teaching Relevant Activities for Concepts and Skills
[TRACS], National Energy Education Development [NEED], and a school
system-developed kit being piloted) based upon the learning objectives being addressed.

Selection and implementation of the kits was a decision made solely by the
school district and was conducted before the researchers began this study. As
such, the full rationale for including each specific kit remains unknown.
Instead, the general rationale provided by the school
district was that selections were made that conformed to content objectives in the
mandated state curriculum and that were age appropriate according to vendor
recommendations. Furthermore, because no single vendor provided kits for every
objective in every grade, the school district assembled a committee composed of
administrators and science teachers who made selections from various vendors to
organize a group of kits that, in combination, provided comprehensive coverage
of curriculum content in each grade.

The committee's selections resulted in the use of four kits in each of the three
grades. A brief description of each kit and its basic contents is provided. In the third
grade, the following kits were used: (1) STC - Plant Growth and Development,
which contains plant seeds, fertilizers, containers, measuring devices, lighting, and
a teacher's guide; (2) STC - Soils, which contains soils and sediments, containers,
measuring devices, worms, and a teacher's guide; (3) TRACS - Investigating
Objects in the Sky, which contains chalk, clay, measuring devices, models, and
a teacher's guide; and (4) a school system-developed physical science kit being
piloted, which contains a Lego Dacta Kit, a selection of trade books, a set of plastic
tools, and a teacher's guide. In the fourth grade, the following kits were used: (1)
STC - Animal Studies, which contains aquarium and terrarium materials, frogs,
crabs, plankton, vegetation, containers, measuring devices, and a teacher's guide;
(2) FOSS - Earth Materials, which contains mineral specimens, evaporating dishes,
containers, measuring devices, rock specimens, and a teacher's guide; (3) FOSS
- Magnetism and Electricity, which contains batteries, bulbs, compasses, magnets,
motors, iron filings, switches, wire, and a teacher's guide; and (4) FOSS - Ideas
and Inventions, which contains mirrors, pens, posters, containers, periscopes,
textured objects, and a teacher's guide. Lastly, in the fifth grade, the following
kits were used: (1) STC - Ecosystems, which contains aquarium and terrarium
materials, fish, snails, algae, vegetation, soil, seeds, containers, measuring devices,
and a teacher's guide; (2) TRACS - Investigating Weather Systems, which contains
thermometers, barometers, containers, measuring devices, and a teacher's guide;
(3) FOSS - Landforms, which contains sediments, maps, photos, containers,
foam mountains, measuring devices, stream tables/trays, and a teacher's guide;
and (4) NEED - Science of Energy, which contains glow sticks, hand warmers,
chemicals, toys, solar panels, flashlights, thermometers, and a teacher's guide. All
the kits used share common features: (1) they promote conceptual understanding;
(2) they promote active learning and exploration; (3) they provide background
information for teachers; (4) they provide most lesson materials and supplies;
(5) they include appropriate sequencing of science concepts; and (6) they have
undergone extensive field testing by curriculum developers. There are some
differences among the kits, however, including format, the number of enrichment
activities, and the inclusion of interdisciplinary curricula.

The five schools selected as the comparison group were chosen based on
a number of factors, including composite end-of-grade (EOG) scores on state
standardized tests, percentage of free/reduced lunch, percentage non-white,
student population of school, and school scheduling format (i.e., traditional
vs. year-round enrollment). Selection of factors was based on research regarding
comparison school equivalency (Campbell & Stanley, 1981; Grossman & Tierney,
1993; O'Sullivan et al., 2003). Each comparison school was selected to match an
individual treatment school. None of the comparison schools used science kits as
a regular, systematic part of science instruction. Although data were not available
to provide frequencies of various instructional strategies used in comparison
schools, analysis using thin description (i.e., "... a simple reporting of acts . . .";
Denzin, 2001, p. 162) showed that typical modes of instruction included lecture,
independent practice using worksheets, and textbook readings. Table 1 illustrates
the pairings and also places the factors in order of importance (from left to right
with left being the most important) in the selection of paired schools.


Table 1. Treatment and Comparison Pairings 


| School | Traditional/Year-Round | Composite Score from EOG | % Free/Reduced Lunch | % Non-White | Student Population |
|---|---|---|---|---|---|
| Treatment 1 | Traditional | 78.8 | 29 | 59.8 | 435 |
| Comparison 1 | Traditional | 78.7 | 32 | 45.7 | 381 |
| Treatment 2 | Traditional | 74.0 | 35 | 48.2 | 278 |
| Comparison 2 | Traditional | 73.1 | 34 | 32.8 | 485 |
| Treatment 3 | Traditional | 84.6 | 25 | 49.7 | 616 |
| Comparison 3 | Traditional | 86.1 | 24 | 36.5 | 902 |
| Treatment 4 | Traditional | 88.4 | 29 | 36.2 | 387 |
| Comparison 4 | Traditional | 88.5 | 21 | 31.3 | 719 |
| Treatment 5 | Year-round | 96.7 | 7 | 33.9 | 982 |
| Comparison 5 | Year-round | 95.0 | 4 | 19.1 | 964 |
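
The closeness of each match can be inspected by computing the within-pair gap on every matching factor. The following is a minimal sketch and not part of the original analysis: the values are transcribed from Table 1, and the names `TABLE_1`, `FACTORS`, and `pair_gaps` are illustrative assumptions.

```python
# Values transcribed from Table 1, in the order:
# (EOG composite, % free/reduced lunch, % non-white, student population).
# All names here are illustrative; they do not come from the study itself.
TABLE_1 = {
    1: {"treatment": (78.8, 29, 59.8, 435), "comparison": (78.7, 32, 45.7, 381)},
    2: {"treatment": (74.0, 35, 48.2, 278), "comparison": (73.1, 34, 32.8, 485)},
    3: {"treatment": (84.6, 25, 49.7, 616), "comparison": (86.1, 24, 36.5, 902)},
    4: {"treatment": (88.4, 29, 36.2, 387), "comparison": (88.5, 21, 31.3, 719)},
    5: {"treatment": (96.7, 7, 33.9, 982), "comparison": (95.0, 4, 19.1, 964)},
}
FACTORS = ("eog_composite", "pct_free_reduced_lunch", "pct_non_white", "population")


def pair_gaps(table):
    """Return {pair: {factor: treatment minus comparison}}, rounded to 1 decimal."""
    gaps = {}
    for pair, schools in table.items():
        diffs = (t - c for t, c in zip(schools["treatment"], schools["comparison"]))
        gaps[pair] = {name: round(d, 1) for name, d in zip(FACTORS, diffs)}
    return gaps


if __name__ == "__main__":
    for pair, gap in pair_gaps(TABLE_1).items():
        print(f"Pair {pair}: {gap}")
```

Under this reading of Table 1, the EOG composite scores differ by at most 1.7 points within any pair, whereas the percentage non-white differs by as much as 15.4 points, which is consistent with the left-to-right ordering of the matching factors.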


After we selected the comparison schools, classroom teachers administered
an assessment instrument designed by the researchers, which contained eight
items that focused on science content objectives for each of the respective grades
(Appendices A, B, & C). We selected a representative sample of science concepts
addressed by both the state standards and the science kits. We purposefully
constructed the items to assess student conceptual understanding developed through
experiential learning. We completed face-validity tests for all instruments. Items
were examined by two scientists, two science educators, two classroom teachers,
and three students (one from each grade level). Refinements were made in response
to suggestions. Tests for validity addressed issues such as the following: (1) content
addressed in each item conformed to the state science standards for the targeted
grade level; (2) scientific content was accurate; (3) the items addressed rich and
relevant content; (4) the distracters were appropriate; (5) the items discriminated
between deep understanding and superficial familiarity; (6) the items addressed
conceptual understanding, not memorized facts; (7) the items were appropriate
for the grade levels addressed, including language; (8) the items were consistent
with the content taught in the grade levels addressed; (9) the items were clear and
understandable; and (10) the questions were not too easy or too hard. We scored
the tests with a scanner, which provided totals for correct responses for each grade
and school. All data were entered into SAS, and statistical tests were run.

Data Analysis and Results 

We tested for a significant difference between treatment and control sites by
considering each pair of matched sites for a particular grade. Because of the
variation in the number of participants and their performance across schools, we
did not attempt to combine all treatment sites and all control sites for each grade;
therefore, we have a separate analysis for all 15 pairs of the three grades across the
five sites. After determining that the data did not fit the traditional assumptions of
normality, we chose the Wilcoxon rank-sum test for two independent samples for
the analysis. Data analysis was conducted with the NPAR1WAY procedure in SAS
software.

The Wilcoxon rank-sum test for each comparison tests the null hypothesis of no 
difference between the classes of that grade for the pair of matched schools. For 
this test, ranks were tabulated for test scores as if the two classes were combined.
If the rank sums of the two classes in the combined sample were close to the
values expected for classes of their sizes, the two classes were considered not
significantly different; however, if the rank sums were statistically different,
the null hypothesis of no difference between the classes was rejected.
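
The procedure described above can be illustrated with a short sketch. This is not the SAS NPAR1WAY implementation used in the study; it is a minimal version of the Wilcoxon rank-sum test that assumes a normal approximation with a tie correction and no continuity correction (some software applies one). The function names and the sample scores are hypothetical.

```python
import math


def midranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(a, b):
    """Return (rank sum of sample a, z, two-sided p) for two independent samples."""
    combined = list(a) + list(b)
    n1, n2 = len(a), len(b)
    n = n1 + n2
    ranks = midranks(combined)          # rank scores as if the classes were combined
    w = sum(ranks[:n1])                 # rank sum for the first class
    expected = n1 * (n + 1) / 2         # rank sum expected under the null hypothesis
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:                   # every score identical: no evidence either way
        return w, 0.0, 1.0
    z = (w - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p


if __name__ == "__main__":
    # Hypothetical scores (number correct out of 8) for two small classes.
    treatment = [6, 7, 5, 8, 7]
    control = [3, 4, 5, 2, 4]
    w, z, p = rank_sum_test(treatment, control)
    print(f"rank sum = {w}, z = {z:.3f}, p = {p:.4f}")
```

Because test scores on an eight-item instrument take few distinct values, ties are common, which is why the sketch assigns average ranks to tied scores and adjusts the variance accordingly.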

Table 2 shows the number of participants and the mean rank score for all treatment
and control sites for Grade 3. The p-value highlighted by an asterisk indicates a
significant result in favor of the treatment group at the alpha level of .05.


Table 2. Pairwise Comparisons for Grade 3 



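| Pair | Treatment Site N | Treatment Site Mean | Control Site N | Control Site Mean | Test Statistic | P-Value |
|---|---|---|---|---|---|---|
| Pair 1 | 71 | 66.00 | 61 | 67.08 | 4,092.00 | .87 |
| Pair 2 | 24 | 53.81 | 67 | 43.20 | 1,291.50 | .09 |
| Pair 3 | 82 | 93.40 | 73 | 60.70 | 4,431.00 | <.001* |
| Pair 4 | 46 | 63.21 | 75 | 59.65 | 2,907.50 | .58 |
| Pair 5 | 154 | 143.41 | 149 | 160.88 | 23,971.50 | .07 |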


Table 3 shows the number of participants and the mean rank score for all treatment
and control sites for Grade 4. The p-values highlighted by an asterisk indicate a
significant result in favor of the treatment group at the alpha level of .05, and the
p-value highlighted by a double asterisk indicates a significant result in favor of
the control group.


Table 3. Pairwise Comparisons for Grade 4 



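| Pair | Treatment Site N | Treatment Site Mean | Control Site N | Control Site Mean | Test Statistic | P-Value |
|---|---|---|---|---|---|---|
| Pair 1 | 42 | 48.89 | 43 | 37.24 | 2,053.50 | .03* |
| Pair 2 | 44 | 66.60 | 66 | 48.10 | 2,930.50 | <.01* |
| Pair 3 | 87 | 95.70 | 106 | 98.07 | 8,326.00 | .77 |
| Pair 4 | 44 | 49.22 | 45 | 40.88 | 2,165.50 | .12 |
| Pair 5 | 138 | 117.39 | 118 | 141.50 | 16,696.50 | <.01** |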


Table 4 shows the number of participants and the mean rank score for all treatment
and control sites for Grade 5. The p-values highlighted by an asterisk indicate a
significant result in favor of the treatment group at the alpha level of .05.


Table 4. Pairwise Comparisons for Grade 5



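| Pair | Treatment Site N | Treatment Site Mean | Control Site N | Control Site Mean | Test Statistic | P-Value |
|---|---|---|---|---|---|---|
| Pair 1 | 49 | 51.74 | 40 | 36.74 | 1,469.50 | <.01* |
| Pair 2 | 24 | 48.58 | 69 | 46.45 | 1,166.00 | .73 |
| Pair 3 | 100 | 106.87 | 80 | 70.04 | 5,603.50 | <.0001* |
| Pair 4 | 58 | 63.54 | 72 | 67.08 | 3,685.50 | .59 |
| Pair 5 | 124 | 134.51 | 148 | 138.17 | 16,679.50 | .70 |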


Overall, the analysis showed a significant result in favor of the treatment group for five of
the 15 pairs and in favor of the control group for only one pair.
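
The figures reported in Tables 2 through 4 can be cross-checked for internal consistency. The sketch below is not part of the original analysis and rests on two assumptions that the numbers themselves appear to support: that the "Mean" columns are mean rank scores, and that the test statistic is the rank sum of the smaller sample in each pair. The row data are transcribed from the tables; the flag for Grade 3, Pair 3 follows the text, which reports one asterisked result in Table 2. All names are illustrative.

```python
# Rows transcribed from Tables 2-4:
# (grade, pair, n_treat, mean_treat, n_ctrl, mean_ctrl, test_stat, flag)
# flag: "T" = significant, favors treatment; "C" = significant, favors control;
#       ""  = not significant at the .05 level.
ROWS = [
    (3, 1, 71, 66.00, 61, 67.08, 4092.00, ""),
    (3, 2, 24, 53.81, 67, 43.20, 1291.50, ""),
    (3, 3, 82, 93.40, 73, 60.70, 4431.00, "T"),
    (3, 4, 46, 63.21, 75, 59.65, 2907.50, ""),
    (3, 5, 154, 143.41, 149, 160.88, 23971.50, ""),
    (4, 1, 42, 48.89, 43, 37.24, 2053.50, "T"),
    (4, 2, 44, 66.60, 66, 48.10, 2930.50, "T"),
    (4, 3, 87, 95.70, 106, 98.07, 8326.00, ""),
    (4, 4, 44, 49.22, 45, 40.88, 2165.50, ""),
    (4, 5, 138, 117.39, 118, 141.50, 16696.50, "C"),
    (5, 1, 49, 51.74, 40, 36.74, 1469.50, "T"),
    (5, 2, 24, 48.58, 69, 46.45, 1166.00, ""),
    (5, 3, 100, 106.87, 80, 70.04, 5603.50, "T"),
    (5, 4, 58, 63.54, 72, 67.08, 3685.50, ""),
    (5, 5, 124, 134.51, 148, 138.17, 16679.50, ""),
]


def total_participants(rows):
    """Sum treatment and control N over all pairs."""
    return sum(r[2] + r[4] for r in rows)


def stat_matches_rank_sum(row):
    """True if the test statistic equals N x mean for the smaller sample,
    allowing for the means being rounded to two decimals."""
    _, _, n_t, m_t, n_c, m_c, stat, _ = row
    n, m = (n_t, m_t) if n_t <= n_c else (n_c, m_c)
    return abs(n * m - stat) <= 0.005 * n + 1e-6


def means_are_mean_ranks(row):
    """True if the pooled mean equals (N + 1) / 2, as mean ranks must."""
    _, _, n_t, m_t, n_c, m_c, _, _ = row
    n = n_t + n_c
    pooled = (n_t * m_t + n_c * m_c) / n
    return abs(pooled - (n + 1) / 2) <= 0.005


def tally(rows):
    """Count pairs by outcome flag."""
    counts = {"T": 0, "C": 0, "": 0}
    for row in rows:
        counts[row[7]] += 1
    return counts


if __name__ == "__main__":
    print("Total participants:", total_participants(ROWS))
    print("Statistic = rank sum of smaller sample:",
          all(stat_matches_rank_sum(r) for r in ROWS))
    print("Means behave as mean ranks:", all(means_are_mean_ranks(r) for r in ROWS))
    print("Outcome tally:", tally(ROWS))
```

Run on the transcribed values, the participant counts sum to the 2,299 students reported in the Methods section, and the tally reproduces the five treatment-favoring pairs, one control-favoring pair, and nine pairs with no significant difference.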

Conclusions and Implications 

In all the pairings but one there were either no statistical differences in scores
or else there were statistical differences in favor of the treatment groups. These
results indicate that systemic implementation of science kits is successful in some
contexts at enhancing student understanding as measured by application-based
content questions. We acknowledge many variables exist such as frequency of kit
use, implementation of kits, alternative approaches implemented in comparison
schools, and teacher and student affective variables, all of which may serve to
provide further insight into the effectiveness of the use of science kits in the
classroom. We conducted this study within the limits of our resources (i.e.,
funding, time, and access to participants), however, and while additional lines
of inquiry are necessary to gain a more complete picture of the efficacy of science
kits, our findings contribute to the body of knowledge regarding science kit use
by providing a comparison between structured systemic use and nonsystematic,
teacher-selected methods.

An important implication that stems from our findings involves keeping
science kits available to stakeholders as an effective option for student learning.
The literature includes many studies documenting the capacity of active science
education (e.g., hands-on learning) as opposed to passive science education (e.g.,
copying notes from the board) to improve students' attitudes toward science
(NRC, 2000). It is reasonable to assume that the students participating in this study
who engaged in active science education would also demonstrate more favorable
attitudes toward science than those involved in passive science education. Logically,
if content knowledge test scores yielded by a passive and an active approach
are about the same, the attitude advantage makes the active science education
approach a better choice. The empirical results from this study suggest that in 14
out of 15 comparisons, there were improved content understandings and/or an
inferred attitude advantage (Fraser, 1980; Freedman, 1997; Siegel & Ranney, 2003)
among the treatment groups.

There is a related implication regarding the inferred attitude and confidence
advantages with teachers since the use of science kits has been shown to enhance
these areas (NRC, 2000; Rubino, Barley, & Jenness, 1994). If teachers exhibit greater
confidence in their science teaching by using kits, it is logical to conclude that a
systemic implementation of kits in a school district would make a difference for
teachers who dislike science and/or who lack confidence in teaching science. If
teachers replace teacher-centered instructional strategies (e.g., textbook readings)
with activities that actively engage children, there should be improvement in both
student understanding of science and their attitudes toward science. Despite the
challenges (e.g., logistics, teacher resistance) of implementing systemic science
kit use within school systems, the prospects of enhanced content knowledge
and improved attitudes towards science make kits a viable option for effective
science teaching and learning.

Appendix A


SCIENCE PROGRAM SURVEY
3rd GRADE

Instructions: With your teacher's help, fill in the
circles to show your school code and grade level.
For questions 1 through 10, darken the circle beside
your choice for the best answer to the question.

School Code: [bubble grid; three columns of circles for the digits 0 through 9]

Grade Level: O 3   O 4   O 5

1. How do bees help flowers produce 
seeds? 

O They make honey to feed the plant. 

O They spread pollen from one flower 
part to another. 

O They flap their wings to keep the 
flower from getting overheated. 

O They clean the plant with their 
tongues. 

2. If you shine a flashlight at a mirror, 
what path does the light take? 

O Most of the light bounces back. 

O Most of the light passes through the 
mirror to the other side. 

O Most of the light goes into the mirror 
and stops. 

O Most of the light bends around the 
mirror. 


3. If you put clay, sand, and water in a
test tube and shake the test tube up
and then do not disturb the test tube
anymore, what is likely to happen?

O Within a minute, the sand and clay
both settle to the bottom and the
water is clear.

O Within a minute, the sand settles to 
the bottom, but some of the clay stays 
mixed in with the water. 

O After three days, the clay settles to 
the bottom, but the sand stays mixed 
in with the water. 

O After three days, the sand, water, and 
clay stay all mixed up together. 

4. If you set up an experiment with the 
materials below, what might you be 
able to learn? 


[Figure: experiment materials; the only legible label is "WATER"]



O How heavy the soil is 
O The temperature of the soil 
O How well the soil holds water 
O The amount of soil 


Journal of Elementary Science Education * Spring 2006 * 18(1) 






5. Which answer best describes the 
reading on the thermometer below? 



O 24°F 
O 24°C 
O 85°F 
O -25°C 

6. To show how much a plant 

grows in a certain amount of time, 
what would be a good label for the 
bottom of the graph below? 


How Much A Plant Grows 



O Number of plants in a pot 
O Amount of soil in a pot 
O Number of days of growth 
O Amount of water given each day 


7. Which order of these moon phases 
best shows what happens as the 
moon changes? 

[Answer choices: four sequences of moon-phase images; the images are not recoverable]

8. If you measured the shadow of a tree
in your yard, at which time would
the shadow be longest? 

O Noon 

O Early afternoon 
O Mid afternoon 
O Late afternoon 

Appendix B


SCIENCE PROGRAM SURVEY
4th GRADE

Instructions: With your teacher's help, fill in the
circles to show your school code and grade level.
For questions 1 through 10, darken the circle beside
your choice for the best answer to the question.

School Code: [bubble grid; three columns of circles for the digits 0 through 9]

Grade Level: O 3   O 4   O 5

1. If you had a frog in a habitat in your
classroom, what kind of behavior
might you be able to observe?

O how rapid its heart beats
O how it moves
O how warm its water is
O how large its tank is

2. If you listed the parts of a frog's
habitat, what would you include?

O length of body, number of legs, type of 
skin 

O size of eyes, location of ears, weight 
of body 

O type of food, amount of space, 
amount of water 

O how it breathes and how it swims 


3. Which one of the electrical circuits 
below will make the light bulb glow? 


[Figure: four circuit diagrams, choices A through D; legible labels include SWITCH and MOTOR]





O A 
O B 
O C 
O D 


4. What can you do to increase the
strength of an electromagnet made
with a nail, a battery, and metal wire?

O use a longer nail
O insulate the wire with plastic
O use a C battery instead of a D battery
O wrap more turns of wire

5. What is the difference between rocks
and minerals?

O rocks are heavy; minerals are light
O rocks will not dissolve in water;
minerals will
O rocks are made of different
ingredients; minerals of only one
O rocks are rough to touch; minerals are
smooth

6. If you dissolve a mineral in a beaker
of water, what is likely to happen if 
you leave the beaker in the sun for a 
few days? 

O the water will evaporate, leaving the 
mineral behind 

O the water and mineral will both 
evaporate, leaving an empty beaker 
O the water will remain, but the mineral 
will disappear 

O the water and the mineral will both 
remain in the beaker 


7. Which of these statements is ALWAYS
true about inventions?

O inventions always start with an
idea

O inventions always help you do 
something faster or better 
O inventions always turn into an idea 
O inventions are always useful 

8. If you wanted to lift a heavy rock with 
a metal bar, which arrangement below 
will make it easiest for you? 

[Figure: four lever arrangements, choices A through D]



O A 
O B 
O C 
O D 

Appendix C


SCIENCE PROGRAM SURVEY
5th GRADE

Instructions: With your teacher's help, fill in the
circles to show your school code and grade level.
For questions 1 through 10, darken the circle beside
your choice for the best answer to the question.

School Code: [bubble grid; three columns of circles for the digits 0 through 9]

Grade Level: O 3   O 4   O 5


1. Which of the following sources of 
energy is considered nonrenewable? 
O hydropower 
O petroleum 
O solar 
O wind 


2. Which of the statements below is true
regarding energy transformations?

O chemical energy is converted to
mechanical (motion) energy when
your body uses food
O radiant energy is converted to
mechanical (motion) energy when a
motor turns an airplane propeller
O potential energy is converted to
kinetic energy when electrical energy
is used to heat an oven
O mechanical (motion) energy is
converted to potential energy when
windmills produce electricity


3. Refer to the topographic map below 
and choose the point of highest 
elevation. 



O A 
O B 
O C 
O D 


4. If you conducted a stream table
experiment, which of these actions
would you predict to increase the
amount of erosion and deposition?
O use more water 
O use less water 
O make the slope flatter 
O place a barrier in the water path 


5. What is the source of energy that 
drives weather systems? 

O wind 
O water 
O sun 
O rain 

6. In the diagram representing the water
cycle below, what process is
represented by the arrow marked
"X"?

[Figure: water cycle diagram with an arrow marked "X"]

O evaporation
O precipitation
O condensation
O elevation


7. If you construct an aquarium with
snails, fish, algae, and a water plant,
what kind of relationship exists in
your ecosystem?

O the fish provide food for the water
plant; the water plant provides carbon
dioxide for the fish
O the fish provide oxygen for the water
plant; the water plant provides shelter
for the fish
O the snails provide carbon dioxide for
the algae; the algae provide food and
oxygen for the snails
O the snails provide food for the fish; the
fish provide oxygen for the snails


8. What would likely happen to your
aquarium if all the plants and algae
died?

O the snails would live, but the fish
would die
O the fish would live, but the snails
would die
O both the fish and the snails would live
O both the fish and the snails would die